26 research outputs found

    Continuous-Flow Matrix Transposition Using Memories

    In this paper, we analyze how to calculate the matrix transposition in continuous flow by using a memory or group of memories. The proposed approach studies this problem for specific conditions such as square and non-square matrices, use of limited access memories and use of several memories in parallel. Contrary to previous approaches, which are based on specific cases or examples, the proposed approach derives the fundamental theory involved in the problem of matrix transposition in a continuous flow. This allows for obtaining the exact equations for the read and write addresses of the memories and other control signals in the circuits. Furthermore, the cases that involve non-square matrices, which have not been studied in detail in the literature, are analyzed in depth in this paper. Experimental results show that the proposed approach is capable of transposing matrices of 8192 × 8192 32-bit data received in series at a rate of 200 megasamples per second, which doubles the throughput of previous approaches. © 2004-2012 IEEE
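    As a rough illustration of the idea of deriving read and write addresses for a single memory, the sketch below models continuous-flow transposition of square n × n matrices in software. It is a generic single-memory scheme and an assumption on our part, not the equations derived in the paper: each cycle reads one address and then overwrites it with the incoming sample, and the address order alternates between row-major and column-major from one matrix to the next. The function name `transpose_stream` and the flushing with padding samples are hypothetical choices made for this example.

```python
def transpose_stream(samples, n):
    """Transpose a serial stream of n x n matrices using one n*n-word memory.

    Each cycle reads the word at `addr` and then overwrites it with the
    incoming sample. Even-numbered matrices use row-major addresses,
    odd-numbered matrices use column-major addresses (assumed scheme).
    """
    size = n * n
    data = list(samples)
    if len(data) % size != 0:
        raise ValueError("stream must contain whole n x n matrices")

    mem = [None] * size
    out = []
    # One extra matrix of padding flushes the last real matrix out of memory.
    for t, x in enumerate(data + [None] * size):
        k, i = divmod(t, size)   # matrix index, position inside the matrix
        r, c = divmod(i, n)      # row and column of the incoming sample
        addr = i if k % 2 == 0 else c * n + r
        if k > 0:
            out.append(mem[addr])  # read: sample of the previous matrix
        mem[addr] = x              # write: sample of the current matrix
    return out


# Two 2 x 2 matrices sent row by row: [[1, 2], [3, 4]] and [[5, 6], [7, 8]].
print(transpose_stream([1, 2, 3, 4, 5, 6, 7, 8], 2))
# → [1, 3, 2, 4, 5, 7, 6, 8]
```

    In this model the output is delayed by exactly one matrix, and the memory never holds more than n² words because every read frees the location that the next write uses.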

    Evaluation of penalty functions for semi-global matching cost aggregation

    The stereo matching method semi-global matching (SGM) relies on consistency constraints during the cost aggregation which are enforced by so-called penalty terms. This paper proposes new and evaluates four penalty functions for SGM. Due to mutual dependencies, two types of matching cost calculation, census and rank transform, are considered. Performance is measured using original and degenerated images exhibiting radiometric changes and noise from the Middlebury benchmark. The two best performing penalty functions are inversely proportional and negatively linear to the intensity gradient and perform equally with 6.05 % and 5.91 % average error, respectively. The experiments also show that adaptive penalty terms are mandatory when dealing with difficult imaging conditions. Consequently, for highest algorithmic performance in real-world systems, selection of a suitable penalty function and thorough parametrization with respect to the expected image quality is essential. Stifterverband für die deutsche Wissenschaf
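    To make the role of the penalty terms concrete, the following sketch aggregates matching costs along one 1-D path in the usual SGM manner, with the larger penalty P2 adapted to the intensity gradient. The two penalty shapes (inversely proportional and negatively linear) follow the wording of the abstract, but their exact form, the clamping to P1, and all names and parameters (`p2_inverse`, `p2_linear`, `p2_max`, `alpha`, `gamma`, `aggregate_path`) are assumptions for illustration and are not taken from the paper.

```python
def p2_inverse(grad, p1, p2_max):
    """P2 inversely proportional to the absolute intensity gradient (assumed form)."""
    return max(p1, p2_max / max(abs(grad), 1))


def p2_linear(grad, p1, alpha, gamma):
    """P2 falling linearly with the absolute intensity gradient (assumed form)."""
    return max(p1, gamma - alpha * abs(grad))


def aggregate_path(costs, intensities, p1, p2_fn):
    """Aggregate costs left to right along one path.

    costs[x][d] is the matching cost of pixel x at disparity d;
    intensities[x] is the grey value of pixel x.
    """
    n_disp = len(costs[0])
    agg = [list(costs[0])]
    for x in range(1, len(costs)):
        prev = agg[-1]
        prev_min = min(prev)
        p2 = p2_fn(intensities[x] - intensities[x - 1])
        row = []
        for d in range(n_disp):
            candidates = [prev[d], prev_min + p2]      # same disparity, large jump
            if d > 0:
                candidates.append(prev[d - 1] + p1)    # disparity change of -1
            if d < n_disp - 1:
                candidates.append(prev[d + 1] + p1)    # disparity change of +1
            # Subtracting prev_min keeps the aggregated values bounded.
            row.append(costs[x][d] + min(candidates) - prev_min)
        agg.append(row)
    return agg


# Two pixels, two disparities, no intensity edge between the pixels.
result = aggregate_path([[0, 10], [10, 0]], [100, 100], 5,
                        lambda g: p2_inverse(g, 5, 100))
print(result)  # → [[0, 10], [10, 5]]
```

    With either adaptive form, a strong intensity edge lowers P2, so a large disparity jump at that pixel is penalized less than in a uniform region.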

    Signal Processing Systems - Design and implementation

    Hardware-Abbildung eines videobasierten Verfahrens zur echtzeitfähigen Auswertung von Winkelhistogrammen auf eine modulare Coprozessor-Architektur [Hardware mapping of a video-based approach for real-time evaluation of angular histograms on a modular coprocessor architecture]

    This paper presents the mapping of a video-based approach for real-time evaluation of angular histograms onto a modular coprocessor architecture. The architecture comprises several dedicated processing elements for parallel processing of computation-intensive image processing tasks and is coupled with a RISC processor. A configurable extension of the architecture, a processing element for evaluating angular histograms of objects, enables real-time classification in conjunction with the RISC processor. Depending on its configuration, the extension requires between 3,300 and 12,000 look-up tables on a Xilinx Virtex-5 FPGA. At a clock frequency of 100 MHz, up to 100 objects of size 256 × 256 pixels can be analyzed per frame in a 25 Hz video stream, independently of the image resolution.

    A Video Signal Processor for MIMD Multiprocessing

    The video signal processor AxPe1280V has been developed for implementation of different video coding applications according to standards like ITU-T H.261/H.263 and ISO MPEG-1/2. It consists of a RISC processor supplemented by a coprocessor for convolution-like low-level tasks. RISC and coprocessor have been implemented in a standard cell design combined with full-custom modules. The processor was fabricated in a 0.5 µm CMOS technology and has a die size of 82 mm². It provides a peak performance of more than 1 giga arithmetic operations per second (GOPS) at 66 MHz. For processing of very computation-intensive algorithms or high data rates, several processors can be bus-connected to form a MIMD multiprocessor system. 1. INTRODUCTION For real-time coding and transmission of video data, several international standards have been developed. These include ITU-T H.261/H.263 [1, 2] for video telephone, ISO MPEG-1 [3] for multimedia, and ISO MPEG-2 [4] for digital TV. All these standards ..